PATENT



Docket No. GC 382 



PROTEASES FROM GRAM-POSITIVE ORGANISMS 



FIELD OF THE INVENTION 



The present invention relates to serine proteases derived from gram-positive
microorganisms. The present invention provides nucleic acid and amino acid
sequences of serine protease 1, 2, 3, 4 and 5 identified in Bacillus. The present
invention also provides methods for the production of serine protease 1, 2, 3, 4 and 5
in host cells as well as the production of heterologous proteins in a host cell having a
mutation or deletion of part or all of at least one of the serine proteases of the
present invention.



BACKGROUND OF THE INVENTION

Gram-positive microorganisms, such as members of the group Bacillus, have
been used for large-scale industrial fermentation due, in part, to their ability to
secrete their fermentation products into the culture media. In gram-positive bacteria,
secreted proteins are exported across a cell membrane and a cell wall, and then are
subsequently released into the external media, usually maintaining their native
conformation.

Various gram-positive microorganisms are known to secrete extracellular
and/or intracellular protease at some stage in their life cycles. Many proteases are
produced in large quantities for industrial purposes. A negative aspect of the
presence of proteases in gram-positive organisms is their contribution to the overall
degradation of secreted heterologous or foreign proteins.

The classification of proteases found in microorganisms is based on their
catalytic mechanism, which results in four groups: the serine proteases;
metalloproteases; cysteine proteases; and aspartic proteases. These categories
can be distinguished by their sensitivity to various inhibitors. For example, the serine
proteases are inhibited by phenylmethylsulfonylfluoride (PMSF) and
diisopropylfluorophosphate (DIFP); the metalloproteases by chelating agents; the
cysteine enzymes by iodoacetamide and heavy metals; and the aspartic proteases by
pepstatin. The serine proteases have alkaline pH optima, the metalloproteases are
optimally active around neutrality, and the cysteine and aspartic enzymes have
acidic pH optima (Biotechnology Handbooks, Bacillus, vol. 2, edited by Harwood,
1989, Plenum Press, New York).
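
The inhibitor sensitivities and pH optima listed above can be held in a small lookup table. The following Python sketch is illustrative only; the names PROTEASE_CLASSES and classes_inhibited_by are hypothetical and the table contains nothing beyond what the preceding paragraph states.

```python
# Inhibitor sensitivities and pH optima exactly as listed in the paragraph above.
# The dictionary layout and the function name are illustrative assumptions.
PROTEASE_CLASSES = {
    "serine": {"inhibitors": {"PMSF", "DIFP"}, "ph_optimum": "alkaline"},
    "metallo": {"inhibitors": {"chelating agents"}, "ph_optimum": "neutral"},
    "cysteine": {"inhibitors": {"iodoacetamide", "heavy metals"}, "ph_optimum": "acidic"},
    "aspartic": {"inhibitors": {"pepstatin"}, "ph_optimum": "acidic"},
}


def classes_inhibited_by(inhibitor):
    """Return the protease classes the text lists as sensitive to `inhibitor`."""
    return sorted(
        name
        for name, info in PROTEASE_CLASSES.items()
        if inhibitor in info["inhibitors"]
    )
```

An inhibitor not named in the text simply returns an empty list, so the table makes no claim beyond the passage it summarizes.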

Proteolytic enzymes that are dependent upon a serine residue for catalytic
activity are called serine proteases. As described in Methods in Enzymology, vol.
244, Academic Press, Inc., 1994, page 21, serine proteases of the family S9 have the
catalytic residue triad "Ser-Asp-His" with conservation of amino acids around them.

SUMMARY OF THE INVENTION

The present invention relates to the unexpected discovery of five heretofore
unknown or unrecognized S9 type serine proteases found in uncharacterized
translated genomic nucleic acid sequences of Bacillus subtilis, designated herein as
SP1, SP2, SP3, SP4 and SP5, having the nucleic acid and amino acid sequences
shown in the Figures. The present invention is based, in part, upon the presence of
the amino acid triad S-D-H in the five serine proteases, as well as amino acid
conservation around the triad. The present invention is also based in part upon the
heretofore uncharacterized or unrecognized overall amino acid relatedness that SP1,
SP2, SP3, SP4 and SP5 have with the serine protease dipeptidyl-amino peptidase B
from yeast (DAP) and with each other.

The present invention provides isolated polynucleotide and amino acid
sequences for SP1, SP2, SP3, SP4 and SP5. Due to the degeneracy of the genetic
code, the present invention encompasses any nucleic acid sequence that encodes
the SP1, SP2, SP3, SP4 and SP5 deduced amino acid sequences shown in Figures
2A-2B through Figure 6, respectively.

The present invention encompasses amino acid variations of B. subtilis SP1,
SP2, SP3, SP4 and SP5 amino acids disclosed herein that have proteolytic activity.
B. subtilis SP1, SP2, SP3, SP4 and SP5, as well as proteolytically active amino acid
variations thereof, have application in cleaning compositions. The present invention
also encompasses amino acid variations or derivatives of SP1, SP2, SP3, SP4 and
SP5 that do not have the characteristic proteolytic activity as long as the nucleic
acid sequences encoding such variations or derivatives would have sufficient 5' and
3' coding regions to be capable of integration into a gram-positive organism genome.
Such variants would have applications in gram-positive expression systems where it
is desirable to delete, mutate, alter or otherwise incapacitate the naturally occurring
serine protease in order to diminish or delete its proteolytic activity. Such an
expression system would have the advantage of allowing for greater yields of
recombinant heterologous proteins or polypeptides.

The present invention provides methods for detecting gram-positive
microorganism homologs of B. subtilis SP1, SP2, SP3, SP4 and SP5 that comprise
hybridizing part or all of the nucleic acid encoding B. subtilis SP1, SP2, SP3, SP4 or
SP5 with nucleic acid derived from gram-positive organisms, either of genomic or
cDNA origin. In one embodiment, the gram-positive microorganism is selected from
the group consisting of B. licheniformis, B. lentus, B. brevis, B. stearothermophilus,
B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus and
Bacillus thuringiensis.

The production of desired heterologous proteins or polypeptides in gram-positive
microorganisms may be hindered by the presence of one or more proteases
which degrade the produced heterologous protein or polypeptide. One advantage of
the present invention is that it provides methods and expression systems which can
be used to prevent that degradation, thereby enhancing yields of the desired
heterologous protein or polypeptide.

Thus, in another aspect, the present invention provides a gram-positive
microorganism having a mutation or deletion of part or all of the gene encoding SP1
and/or SP2 and/or SP3 and/or SP4 and/or SP5 which results in inactivation of their
proteolytic activity, either alone or in combination with mutations in other proteases,
such as apr, npr, epr, mpr for example, or other proteases known to those of skill in
the art. In one embodiment of the present invention, the gram-positive organism is a
member of the genus Bacillus. In another embodiment, the Bacillus is Bacillus
subtilis.

In yet another aspect, the gram-positive host is genetically engineered to
produce a desired protein. In one embodiment of the present invention, the desired
protein is heterologous to the gram-positive host cell. In another embodiment, the
desired protein is homologous to the host cell. The present invention encompasses a
gram-positive host cell having a deletion or interruption of the nucleic acid encoding
the naturally occurring homologous protein, such as a protease, and having nucleic
acid encoding the homologous protein re-introduced in a recombinant form. In
another embodiment, the host cell produces the homologous protein. Accordingly,
the present invention also provides methods and expression systems for reducing
degradation of heterologous proteins produced in gram-positive microorganisms.
The gram-positive microorganism may be normally sporulating or non-sporulating.






In a further aspect of the present invention, gram-positive SP1, SP2, SP3,
SP4 or SP5 is produced on an industrial fermentation scale in an
expression system. In another aspect, isolated and purified recombinant SP1, SP2,
SP3, SP4 or SP5 is used in compositions of matter intended for cleaning purposes,
such as detergents. Accordingly, the present invention provides a cleaning
composition comprising one or more of a gram-positive serine protease selected
from the group consisting of SP1, SP2, SP3, SP4 and SP5. The serine protease
may be used alone or in combination with other enzymes and/or mediators or
enhancers.



BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A-1C show the DNA and deduced amino acid sequence for SP1 (YuxL).
Figures 2A-2B show an amino acid alignment between DAP (dap2_yeast) and SP1
(YuxL). For Figures 2A-2B, 3 and 4, the amino acid triad S-D-H is indicated.
Figure 3 shows an amino acid alignment between SP1 (YuxL) and SP2 (YtmA).
Figure 4 shows an amino acid alignment between SP1 (YuxL) and SP3 (YitV).
Figure 5 shows an amino acid alignment between SP1 (YuxL) and SP4 (YqkD).
Figure 6 shows an amino acid alignment between SP1 (YuxL) and SP5 (CAH).
Figures 7A-7B show the DNA and deduced amino acid sequence for SP2 (YtmA).
Figures 8A-8B show the DNA and deduced amino acid sequence for SP3 (YitV).
Figures 9A-9B show the DNA and deduced amino acid sequence for SP4 (YqkD).
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Definitions

As used herein, the genus Bacillus includes all members known to those of
skill in the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B.
brevis, B. stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B.
circulans, B. lautus and B. thuringiensis.

The present invention encompasses novel SP1, SP2, SP3, SP4 and SP5
from gram-positive organisms. In a preferred embodiment, the gram-positive
organism is a Bacillus. In another preferred embodiment, the gram-positive
organism is Bacillus subtilis. As used herein, "B. subtilis SP1" (YuxL) refers to the
DNA and deduced amino acid sequence shown in Figures 1A-1C and Figures 2A-2B;
SP2 (YtmA) refers to the DNA and deduced amino acid sequence shown in
Figures 7A-7B and Figure 3; SP3 (YitV) refers to the DNA and deduced amino acid
sequence shown in Figures 8A-8B and Figure 4; SP4 (YqkD) refers to the DNA and
deduced amino acid sequence shown in Figures 9A-9B and Figure 5; and SP5
(CAH) refers to the deduced amino acid sequence shown in Figure 6. The present
invention encompasses amino acid variations of the B. subtilis amino acid sequences
of SP1, SP2, SP3, SP4 and SP5 that have proteolytic activity. Such proteolytic
amino acid variants can be used in cleaning compositions. The present invention
also encompasses B. subtilis amino acid variations or derivatives that are not
proteolytically active. DNA encoding such variants can be used in methods designed
to delete or mutate the naturally occurring host cell SP1, SP2, SP3, SP4 and SP5.

As used herein, "nucleic acid" refers to a nucleotide or polynucleotide
sequence, and fragments or portions thereof, and to DNA or RNA of genomic or
synthetic origin which may be double-stranded or single-stranded, whether
representing the sense or antisense strand. As used herein, "amino acid" refers to
peptide or protein sequences or portions thereof. A "polynucleotide homolog" as
used herein refers to a novel gram-positive microorganism polynucleotide that has at
least 80%, at least 90% or at least 95% identity to B. subtilis SP1, SP2, SP3, SP4
or SP5, or which is capable of hybridizing to B. subtilis SP1, SP2, SP3, SP4 or SP5
under conditions of high stringency and which encodes an amino acid sequence
having serine protease activity.

The terms "isolated" or "purified" as used herein refer to a nucleic acid or
amino acid that is removed from at least one component with which it is naturally
associated.

As used herein, the term "heterologous protein" refers to a protein or
polypeptide that does not naturally occur in a gram-positive host cell. Examples of
heterologous proteins include enzymes such as hydrolases including proteases,
cellulases, amylases, carbohydrases, and lipases; isomerases such as racemases,
epimerases, tautomerases, or mutases; transferases, kinases and phosphatases.
The heterologous gene may encode therapeutically significant proteins or peptides,
such as growth factors, cytokines, ligands, receptors and inhibitors, as well as
vaccines and antibodies. The gene may encode commercially important industrial
proteins or peptides, such as proteases, carbohydrases such as amylases and
glucoamylases, cellulases, oxidases and lipases. The gene of interest may be a
naturally occurring gene, a mutated gene or a synthetic gene.

The term "homologous protein" refers to a protein or polypeptide native or
naturally occurring in a gram-positive host cell. The invention includes host cells
producing the homologous protein via recombinant DNA technology. The present
invention encompasses a gram-positive host cell having a deletion or interruption of
the nucleic acid encoding the naturally occurring homologous protein, such as a
protease, and having nucleic acid encoding the homologous protein re-introduced in
a recombinant form. In another embodiment, the host cell produces the homologous
protein.
As used herein, the term "overexpressing" when referring to the production of
a protein in a host cell means that the protein is produced in greater amounts than its
production in its naturally occurring environment.

As used herein, the phrase "proteolytic activity" refers to a protein that is able
to hydrolyze a peptide bond. Enzymes having proteolytic activity are described in
Enzyme Nomenclature, 1992, edited by Webb, Academic Press, Inc.

Detailed Description of the Preferred Embodiments

The unexpected discovery of the serine proteases SP1, SP2, SP3, SP4 and
SP5 in B. subtilis provides a basis for producing host cells, expression methods and
systems which can be used to prevent the degradation of recombinantly produced
heterologous proteins. In a preferred embodiment, the host cell is a gram-positive
host cell that has a deletion or mutation in the naturally occurring serine protease,
said mutation resulting in the complete deletion or inactivation of the production by
the host cell of the proteolytic serine protease gene product. In another
embodiment of the present invention, the host cell is additionally genetically
engineered to produce a desired protein or polypeptide.

It may also be desired to genetically engineer host cells of any type to
produce a gram-positive serine protease SP1, SP2, SP3, SP4 or SP5. Such host
cells are used in large scale fermentation to produce large quantities of the serine
protease which may be isolated or purified and used in cleaning products, such as
detergents.

I. Serine Protease Sequences

The SP1, SP2, SP3 and SP4 polynucleotides having the sequences as
shown in the Figures encode the Bacillus subtilis serine proteases SP1, SP2, SP3
and SP4. As will be understood by the skilled artisan, due to the degeneracy of the
genetic code, a variety of polynucleotides can encode the Bacillus SP1, SP2, SP3,
SP4 and SP5. The present invention encompasses all such polynucleotides.

The present invention encompasses novel SP1, SP2, SP3, SP4 and SP5
polynucleotide homologs encoding gram-positive microorganism serine proteases
SP1, SP2, SP3, SP4 and SP5, respectively, which have at least 80%, or at least
90% or at least 95% identity to the B. subtilis sequences, as long as the homolog
encodes a protein that has proteolytic activity.
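
The identity thresholds above can be checked on a pair of pre-aligned sequences. The following Python sketch is a minimal illustration under stated assumptions: the text does not define how identity is computed, so counting matches over aligned positions where neither sequence has a gap is an assumption, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap.

    Assumption: the two strings are already aligned (equal length, gaps as '-').
    The patent text does not specify this particular identity definition.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the pair reaches the given identity threshold (80, 90 or 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the full alignment length) give different percentages, so any reported figure should state which denominator was used.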

Gram-positive polynucleotide homologs of B. subtilis SP1, SP2, SP3, SP4 or
SP5 may be obtained by standard procedures known in the art from, for example,
cloned DNA (e.g., a DNA "library"), genomic DNA libraries, by chemical synthesis
once identified, by cDNA cloning, or by the cloning of genomic DNA, or fragments
thereof, purified from a desired cell. (See, for example, Sambrook et al., 1989,
Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York; Glover, D.M. (ed.), 1985, DNA Cloning: A
Practical Approach, IRL Press, Ltd., Oxford, U.K., Vol. I, II.) A preferred source is
from genomic DNA. Nucleic acid sequences derived from genomic DNA may
contain regulatory regions in addition to coding regions. Whatever the source, the
isolated serine protease gene should be molecularly cloned into a suitable vector for
propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA fragments are
generated, some of which will encode the desired gene. The DNA may be cleaved at
specific sites using various restriction enzymes. Alternatively, one may use DNAse in
the presence of manganese to fragment the DNA, or the DNA can be physically
sheared, as for example, by sonication. The linear DNA fragments can then be
separated according to size by standard techniques, including but not limited to,
agarose and polyacrylamide gel electrophoresis and column chromatography.

Once the DNA fragments are generated, identification of the specific DNA
fragment containing the SP1, SP2, SP3, SP4 or SP5 may be accomplished in a
number of ways. For example, a B. subtilis SP1, SP2, SP3, SP4 or SP5 gene of the
[...] a gram-positive SP1, SP2, SP3, [...] Proc. Natl. Acad. Sci. USA [...]

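Accordingly, the present invention provides a method for the detection of
gram-positive SP1, SP2, SP3, SP4 or SP5 polynucleotide homologs which
comprises hybridizing part or all of a nucleic acid sequence of B. subtilis SP1, SP2,
SP3, SP4 or SP5 with gram-positive microorganism nucleic acid of either genomic
or cDNA origin.

Also included within the scope of the present invention are gram-positive
microorganism polynucleotide sequences that are capable of hybridizing to the
nucleotide sequence of B. subtilis SP1, SP2, SP3, SP4 or SP5 [...]
confer a defined "stringency" as explained below.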

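"Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm of
the probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate
stringency" at about 10°C to 20°C below Tm; and "low stringency" at about 20°C to
25°C below Tm. As will be understood by those of skill in the art, a maximum
stringency hybridization can be used to identify or detect identical polynucleotide
sequences while an intermediate or low stringency hybridization can be used to
identify or detect polynucleotide sequence homologs.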
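
The stringency ranges in this passage, read as maximum at about Tm-5°C, high at about 5°C to 10°C below Tm, intermediate at about 10°C to 20°C below Tm, and low at about 20°C to 25°C below Tm, can be expressed as a simple classifier. The Python sketch below is illustrative only: the function name is hypothetical, and the handling of the exact boundary values (each boundary assigned to the more stringent class) is an assumption, since the text gives approximate ranges.

```python
def stringency_class(tm_celsius, hyb_temp_celsius):
    """Classify a hybridization temperature relative to the probe Tm.

    Boundaries follow the approximate ranges in the text; assigning each
    boundary value to the more stringent class is an assumption.
    """
    offset = tm_celsius - hyb_temp_celsius  # degrees C below Tm
    if offset < 0:
        raise ValueError("hybridization temperature is above Tm")
    if offset <= 5:
        return "maximum"
    if offset <= 10:
        return "high"
    if offset <= 20:
        return "intermediate"
    if offset <= 25:
        return "low"
    return "below low stringency"
```

For a probe with a Tm of 70°C, hybridizing at 62°C falls in the high-stringency band and hybridizing at 55°C falls in the intermediate band.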
The term "hybridization" as used herein shall include "the process by which a
strand of nucleic acid joins with a complementary strand through base pairing"
(Coombs J (1994) Dictionary of Biotechnology, Stockton Press, New York NY).

The process of amplification as carried out in polymerase chain reaction
(PCR) technologies is described in [...]

A nucleic acid sequence of at least [...] nucleotides
from B. subtilis SP1, SP2, SP3, SP4 or SP5, [...]
and more preferably about 20-25 nucleotides, can be used as a probe or PCR primer.
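
As an illustration of the 20-25 nucleotide probe or primer length mentioned above, the following Python sketch enumerates candidate windows of that length from a sequence and computes reverse complements. It is a minimal sketch under stated assumptions: the function names are hypothetical, and no melting-temperature, specificity or secondary-structure filtering is attempted.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def candidate_probes(seq, min_len=20, max_len=25):
    """Yield (start, subsequence) for every window of min_len..max_len nucleotides.

    The 20-25 nt default follows the preferred range in the text; real probe
    design would add further filters that are outside this sketch.
    """
    seq = seq.upper()
    for length in range(min_len, max_len + 1):
        for start in range(0, len(seq) - length + 1):
            yield start, seq[start:start + length]
```

A reverse primer for the opposite strand would be obtained by applying reverse_complement to a window taken from the far end of the target region.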
The B. subtilis amino acid sequences SP1, SP2, SP3, SP4 and SP5 (shown in
Figures 2A-2B through Figure 6) were identified via a FASTA search of Bacillus
subtilis genomic nucleic acid sequences. B. subtilis SP1 (YuxL) was identified by its
structural homology to the serine protease DAP, classified as an S9 type serine
protease and designated in Figures 2A-2B as "dap2_yeast". As shown in Figures
2A-2B, SP1 has the amino acid triad "S-D-H" indicated. Conservation of amino acids
around each residue is noted in Figures 2A-2B through Figure 6. SP2 (YtmA), SP3
(YitV), SP4 (YqkD) and SP5 (CAH) were identified by their structural and
overall amino acid homology to SP1 (YuxL). SP1 and SP4 were described in Parsot
and Kobayashi, respectively, but were not characterized as serine proteases or
serine proteases of the S9 family.

II. Expression Systems

The present invention provides host cells, expression methods and systems
for the enhanced production and secretion of desired heterologous or homologous
proteins in gram-positive microorganisms. In one embodiment, a host cell is
genetically engineered to have a deletion or mutation in the gene encoding a
gram-positive SP1, SP2, SP3, SP4 or SP5 such that the respective activity is deleted.
In an alternative embodiment of the present invention, a gram-positive microorganism
is genetically engineered to produce a serine protease of the present invention.

Inactivation of a gram-positive serine protease in a host cell

Producing an expression host cell incapable of producing the naturally
occurring serine protease necessitates the replacement and/or inactivation of the
naturally occurring gene from the genome of the host cell. In a preferred
embodiment, the mutation is a non-reverting mutation.

One method for mutating nucleic acid encoding a gram-positive serine
protease is to clone the nucleic acid or part thereof, modify the nucleic acid by
site-directed mutagenesis and reintroduce the mutated nucleic acid into the cell on a
plasmid. By homologous recombination, the mutated gene may be introduced into
the chromosome. In the parent host cell, the result is that the naturally occurring
nucleic acid and the mutated nucleic acid are located in tandem on the chromosome.
After a second recombination, the modified sequence is left in the chromosome,
having thereby effectively introduced the mutation into the chromosomal gene for
progeny of the parent host cell.
Another method for inactivating the serine protease proteolytic activity is
through deleting the chromosomal gene copy. In a preferred embodiment, the entire
gene is deleted, the deletion occurring in such a way as to make reversion
impossible. In another preferred embodiment, a partial deletion is produced,
provided that the nucleic acid sequence left in the chromosome is too short for
homologous recombination with a plasmid encoded serine protease gene. In
another preferred embodiment, nucleic acid encoding the catalytic amino acid
residues are deleted.

Deletion of the naturally occurring gram-positive microorganism serine
protease can be carried out as follows. A serine protease gene including its 5' and 3'
regions is isolated and inserted into a cloning vector. The coding region of the serine
protease gene is deleted from the vector in vitro, leaving behind a sufficient amount
of the 5' and 3' flanking sequences to provide for homologous recombination with the
naturally occurring gene in the parent host cell. The vector is then transformed into
the gram-positive host cell. The vector integrates into the chromosome via
homologous recombination in the flanking regions. This method leads to a
gram-positive strain in which the protease gene has been deleted.
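
The in vitro deletion step described above, removing the coding region while keeping enough 5' and 3' flanking sequence for homologous recombination, can be sketched at the sequence level. The Python function below is a hypothetical illustration: the name build_deletion_insert, the coordinate convention (0-based, half-open) and the flank_len parameter are assumptions, and the text does not state what flank length is sufficient.

```python
def build_deletion_insert(locus, cds_start, cds_end, flank_len):
    """Join the 5' and 3' flanks of a gene, omitting its coding region.

    locus      : sequence containing the gene plus surrounding DNA
    cds_start  : 0-based index of the first coding nucleotide
    cds_end    : 0-based index one past the last coding nucleotide
    flank_len  : nucleotides of flank to keep on each side (assumed parameter)
    """
    if not (0 <= cds_start < cds_end <= len(locus)):
        raise ValueError("coding region coordinates are out of range")
    five_prime = locus[max(0, cds_start - flank_len):cds_start]
    three_prime = locus[cds_end:cds_end + flank_len]
    if len(five_prime) < flank_len or len(three_prime) < flank_len:
        raise ValueError("not enough flanking sequence on one or both sides")
    return five_prime + three_prime
```

The returned string corresponds to the vector insert after deletion; integration and resolution in the host are biological steps that the sketch does not model.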

The vector used in an integration method is preferably a plasmid. A
selectable marker may be included to allow for ease of identification of desired
recombinant microorganisms. Additionally, as will be appreciated by one of skill in
the art, the vector is preferably one which can be selectively integrated into the
chromosome. This can be achieved by introducing an inducible origin of replication,
for example, a temperature sensitive origin into the plasmid. By growing the
transformants at a temperature to which the origin of replication is sensitive, the
replication function of the plasmid is inactivated, thereby providing a means for
selection of chromosomal integrants. Integrants may be selected for growth at high
temperatures in the presence of the selectable marker, such as an antibiotic.

[...] complete gene has been deleted, such as through nucleic acid sequencing or
restriction maps.
Another method of inactivating the naturally occurring serine protease gene
is to mutagenize the chromosomal gene copy by transforming a gram-positive
microorganism with oligonucleotides which are mutagenic. Alternatively, the
chromosomal serine protease gene can be replaced with a mutant gene by
homologous recombination.

The present invention encompasses host cells having additional protease
deletions or mutations, such as deletions or mutations in apr, npr, epr, mpr and
others known to those of skill in the art.

III. Production of Serine Protease

For production of serine protease in a host cell, an expression vector
comprising at least one copy of nucleic acid encoding a gram-positive microorganism
SP1, SP2, SP3, SP4 or SP5, and preferably comprising multiple copies, is
transformed into the host cell under conditions suitable for expression of the serine
protease. In accordance with the present invention, polynucleotides which encode a
gram-positive microorganism SP1, SP2, SP3, SP4 or SP5, or fragments thereof, or
fusion proteins or polynucleotide homolog sequences that encode amino acid
variants of B. subtilis SP1, SP2, SP3, SP4 or SP5, may be used to generate
recombinant DNA molecules that direct their expression in host cells. In a preferred
embodiment, the gram-positive host cell belongs to the genus Bacillus. In another
preferred embodiment, the gram-positive host cell is B. subtilis.
As will be understood by those of skill in the art, it may be advantageous to
produce polynucleotide sequences possessing non-naturally occurring codons.
Codons preferred by a particular gram-positive host cell (Murray E et al (1989) Nuc
Acids Res 17:477-508) can be selected, for example, to increase the rate of
expression or to produce recombinant RNA transcripts having desirable properties,
such as a longer half-life, than transcripts produced from naturally occurring
sequence.
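
Selecting host-preferred codons amounts to back-translating a protein sequence with one chosen codon per residue. The Python sketch below is illustrative only: the PREFERRED_CODON table is a hypothetical example covering a handful of residues, not measured codon usage for B. subtilis or any other host, and the function name is an assumption.

```python
# Hypothetical preference table for illustration only; a real table would be
# built from codon usage data for the chosen host and cover all 20 residues.
PREFERRED_CODON = {
    "M": "ATG",
    "K": "AAA",
    "S": "TCT",
    "D": "GAT",
    "H": "CAT",
    "*": "TAA",  # stop
}


def back_translate(protein, table=PREFERRED_CODON):
    """Return a DNA sequence using one preferred codon per residue."""
    try:
        return "".join(table[residue] for residue in protein.upper())
    except KeyError as exc:
        raise ValueError(
            f"no preferred codon listed for residue {exc.args[0]!r}"
        ) from None
```

Because the genetic code is degenerate, the output encodes the same amino acid sequence as any other synonymous choice; only the nucleotide sequence changes.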

Altered SP1, SP2, SP3, SP4 or SP5 polynucleotide sequences which may
be used in accordance with the invention include deletions, insertions or substitutions
of different nucleotide residues resulting in a polynucleotide that encodes the same
or a functionally equivalent SP1, SP2, SP3, SP4 or SP5 homolog, respectively. As
used herein a "deletion" is defined as a change in either nucleotide or amino acid
sequence in which one or more nucleotides or amino acid residues, respectively, are
absent.

As used herein an "insertion" or "addition" is that change in a nucleotide or
amino acid sequence which has resulted in the addition of one or more nucleotides
or amino acid residues, respectively, as compared to the naturally occurring SP1,
SP2, SP3, SP4 or SP5.

As used herein "substitution" results from the replacement of one or more
nucleotides or amino acids by different nucleotides or amino acids, respectively.

The encoded protein may also show deletions, insertions or substitutions of
amino acid residues which produce a silent change and result in a functionally
equivalent SP1, SP2, SP3, SP4 or SP5 variant. Deliberate amino acid substitutions
may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues as long as the variant
retains the ability to modulate secretion. For example, negatively charged amino
acids include aspartic acid and glutamic acid; positively charged amino acids include
lysine and arginine; and amino acids with uncharged polar head groups having
similar hydrophilicity values include leucine, isoleucine, valine; glycine, alanine;
asparagine, glutamine; serine, threonine, phenylalanine, and tyrosine.
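
The residue groupings listed above can be used to flag whether a proposed substitution stays within one group. The Python sketch below is illustrative only: the one-letter groupings are transcribed from the preceding sentence, while the names SIMILARITY_GROUPS and is_conservative are hypothetical, and membership in a group does not by itself show that a variant retains activity.

```python
# One-letter codes for the groups named in the text:
# Asp/Glu; Lys/Arg; Leu/Ile/Val; Gly/Ala; Asn/Gln; Ser/Thr/Phe/Tyr.
SIMILARITY_GROUPS = [
    set("DE"),
    set("KR"),
    set("LIV"),
    set("GA"),
    set("NQ"),
    set("STFY"),
]


def is_conservative(original, replacement):
    """True when two different residues fall in the same group from the text."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # not a substitution
    return any(a in group and b in group for group in SIMILARITY_GROUPS)
```

A substitution such as Asp to Glu is reported as within-group, whereas Lys to Asp crosses groups and is not.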

The SP1, SP2, SP3, SP4 or SP5 polynucleotides of the present invention
may be engineered in order to modify the cloning, processing and/or expression of
the gene product. For example, mutations may be introduced using techniques
which are well known in the art, e.g., site-directed mutagenesis to insert new
restriction sites, to alter glycosylation patterns or to change codon preference, for
example.

In one embodiment of the present invention, a gram-positive microorganism
SP1, SP2, SP3, SP4 or SP5 polynucleotide may be ligated to a heterologous
sequence to encode a fusion protein. A fusion protein may also be engineered to
contain a cleavage site located between the serine protease nucleotide sequence
and the heterologous protein sequence, so that the serine protease may be cleaved
and purified away from the heterologous moiety.



IV. Vector Sequences

Expression vectors used in expressing the serine proteases of the present
invention in gram-positive microorganisms comprise at least one promoter
associated with a serine protease selected from the group consisting of SP1, SP2,
SP3, SP4 and SP5, which promoter is functional in the host cell. In one embodiment
of the present invention, the promoter is the wild-type promoter for the selected
serine protease and in another embodiment of the present invention, the promoter is
heterologous to the serine protease, but still functional in the host cell. In one
preferred embodiment of the present invention, nucleic acid encoding the serine
protease is stably integrated into the microorganism genome.

In a preferred embodiment, the expression vector contains a multiple cloning
site cassette which preferably comprises at least one restriction endonuclease site
unique to the vector, to facilitate ease of nucleic acid manipulation. In a preferred
embodiment, the vector also comprises one or more selectable markers. As used
herein, the term selectable marker refers to a gene capable of expression in the
gram-positive host which allows for ease of selection of those hosts containing the
vector. Examples of such selectable markers include but are not limited to
antibiotics, such as erythromycin, actinomycin, chloramphenicol and tetracycline.

V. Transformation 

A variety of host cells can be used for the production of SP1, SP2, SP3, SP4
or SP5 including bacterial, fungal, mammalian and insect cells. General
transformation procedures are taught in Current Protocols In Molecular Biology (vol.
1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987, Chapter 9) and include
calcium phosphate methods, transformation using DEAE-Dextran and
electroporation. Plant transformation methods are taught in Rodriquez (WO
95/14099, published 26 May 1995).

In a preferred embodiment, the host cell is a gram-positive microorganism
and in another preferred embodiment, the host cell is Bacillus. In one embodiment
of the present invention, nucleic acid encoding one or more serine protease(s) of the
present invention is introduced into a host cell via an expression vector capable of
replicating within the host cell. Suitable replicating plasmids for Bacillus are
described in Molecular Biological Methods for Bacillus, Ed. Harwood and Cutting,
John Wiley & Sons, 1990, hereby expressly incorporated by reference; see chapter 3
on plasmids. Suitable replicating plasmids for B. subtilis are listed on page 92.
In another embodiment, nucleic acid encoding a serine protease(s) of the
present invention is stably integrated into the microorganism genome. Preferred
host cells are gram-positive host cells. Another preferred host is Bacillus. Another
preferred host is Bacillus subtilis. Several strategies have been described in the
literature for the direct cloning of DNA in Bacillus. Plasmid marker rescue
transformation involves the uptake of a donor plasmid by competent cells carrying a
partially homologous resident plasmid (Contente et al., Plasmid 2:555-571 (1979);
Haima et al., Mol. Gen. Genet. 223:185-191 (1990); Weinrauch et al., J. Bacteriol.
154(3):1077-1087 (1983); and Weinrauch et al., J. Bacteriol. 169(3):1205-1211
(1987)). The incoming donor plasmid recombines with the homologous region of the
resident "helper" plasmid in a process that mimics chromosomal transformation.

Transformation by protoplast transformation is described for B. subtilis in
Chang and Cohen, (1979) Mol. Gen. Genet. 168:111-115; for B. megaterium in
Vorobjeva et al., (1980) FEMS Microbiol. Letters 7:261-263; for B. amyloliquefaciens
in Smith et al., (1986) Appl. and Env. Microbiol. 51:634; for B. thuringiensis in Fisher
et al., (1981) Arch. Microbiol. 139:213-217; for B. sphaericus in McDonald (1984) J.
Gen. Microbiol. 130:203; and B. larvae in Bakhiet et al., (1985) 49:577. Mann et al.,
(1986) Current Microbiol. 13:131-135, report on transformation of Bacillus
protoplasts and Holubova, (1985) Folia Microbiol. 30:97, discloses methods for
introducing DNA into protoplasts using DNA containing liposomes.

VI. Identification of Transformants

Whether a host cell has been transformed with a mutated or a naturally
occurring gene encoding a gram-positive SP1, SP2, SP3, SP4 or SP5, detection of
the presence/absence of marker gene expression can suggest whether the gene of
interest is present. However, its expression should be confirmed. For example, if the
nucleic acid encoding a serine protease is inserted within a marker gene sequence,
recombinant cells containing the insert can be identified by the absence of marker
gene function. Alternatively, a marker gene can be placed in tandem with nucleic
acid encoding the serine protease under the control of a single promoter.
Expression of the marker gene in response to induction or selection usually indicates
expression of the serine protease as well.

Alternatively, host cells which contain the coding sequence for a serine 
protease and express the protein may be identified by a variety of procedures known 
to those of skill in the art. These procedures include, but are not limited to, 
DNA-DNA or DNA-RNA hybridization and protein bioassay or immunoassay techniques 
which include membrane-based, solution-based, or chip-based technologies for the 
detection and/or quantification of the nucleic acid or protein. 

The presence of the serine protease polynucleotide sequence can be detected by 
DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or 
fragments of B. subtilis SP1, SP2, SP3, SP4 or SP5. 
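
As an in silico analogue of the probe-based detection described above, the following minimal Python sketch scans a target sequence for a probe on both strands while tolerating a chosen number of mismatches. It is an illustration only: the function names, the mismatch threshold and the toy sequences are assumptions that do not come from this disclosure, and real hybridization depends on salt, temperature and probe length rather than on a simple mismatch count.

```python
# Illustrative sketch only; names and sequences are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_probe(target, probe, max_mismatches=0):
    """Return (strand, position, mismatches) for every window of the target
    that matches the probe ('+') or its reverse complement ('-')."""
    target, probe = target.upper(), probe.upper()
    queries = (("+", probe), ("-", reverse_complement(probe)))
    hits = []
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        for strand, query in queries:
            mismatches = sum(a != b for a, b in zip(window, query))
            if mismatches <= max_mismatches:
                hits.append((strand, start, mismatches))
    return hits


print(find_probe("TTTACGGATCCATTT", "ACGGAT"))
```

Raising `max_mismatches` loosely mimics lowering the stringency of a hybridization: more divergent homologs are reported, at the cost of more spurious hits.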
VII. Assay of Protease Activity 

There are various assays known to those of skill in the art for detecting and 
measuring protease activity. There are assays based upon the release of acid- 
soluble peptides from casein or hemoglobin, measured as absorbance at 280 nm or 
colorimetrically using the Folin method (Bergmeyer et al., 1984, Methods of 
Enzymatic Analysis vol. 5, Peptidases, Proteinases and their Inhibitors, Verlag 
Chemie, Weinheim). Other assays involve the solubilization of chromogenic 
substrates (Ward, 1983, Proteinases, in Microbial Enzymes and Biotechnology 
(W.M. Fogarty, ed.), Applied Science, London, pp. 251-317). 

VIII. Secretion of Recombinant Proteins 

Means for determining the levels of secretion of a heterologous or homologous 
protein in a gram-positive host cell and detecting secreted proteins include using 
either polyclonal or monoclonal antibodies specific for the protein. Examples include 
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and 
fluorescent activated cell sorting (FACS). These and other assays are described, 
among other places, in Hampton R et al (1990, Serological Methods, a Laboratory 
Manual, APS Press, St Paul MN) and Maddox DE et al (1983, J Exp Med 158:1211). 

A wide variety of labels and conjugation techniques are known by those 
skilled in the art and can be used in various nucleic and amino acid assays. Means 
for producing labeled hybridization or PCR probes for detecting specific 
polynucleotide sequences include oligolabeling, nick translation, end-labeling or PCR 
amplification using a labeled nucleotide. Alternatively, the nucleotide sequence, or 
any portion of it, may be cloned into a vector for the production of an mRNA probe. 
Such vectors are known in the art, are commercially available, and may be used to 
synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such 
as T7, T3 or SP6 and labeled nucleotides. 

A number of companies such as Pharmacia Biotech (Piscataway NJ), 
Promega (Madison WI), and US Biochemical Corp (Cleveland OH) supply 
commercial kits and protocols for these procedures. Suitable reporter molecules or 
labels include those radionuclides, enzymes, fluorescent, chemiluminescent, or 
chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles 
and the like. Patents teaching the use of such labels include US Patents 3,817,837; 
3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and 4,366,241. Also, 
recombinant immunoglobulins may be produced as shown in US Patent No. 
4,816,567 and incorporated herein by reference. 



IX. Purification of Proteins 

Gram-positive host cells transformed with polynucleotide sequences 
encoding heterologous or homologous protein may be cultured under conditions 
suitable for the expression and recovery of the encoded protein from cell culture. 
The protein produced by a recombinant gram-positive host cell comprising a serine 
protease of the present invention will be secreted into the culture media. Other 
recombinant constructions may join the heterologous or homologous polynucleotide 
sequences to nucleotide sequence encoding a polypeptide domain which will 
facilitate purification of soluble proteins (Kroll DJ et al (1993) DNA Cell Biol 
12:441-53). 

Such purification facilitating domains include, but are not limited to, metal 
chelating peptides such as histidine-tryptophan modules that allow purification on 
immobilized metals (Porath J (1992) Protein Expr Purif 3:263-281), protein A 
domains that allow purification on immobilized immunoglobulin, and the domain 
utilized in the FLAGS extension/affinity purification system (Immunex Corp, Seattle 
WA). The inclusion of a cleavable linker sequence such as Factor XA or 
enterokinase (Invitrogen, San Diego CA) between the purification domain and the 
heterologous protein can be used to facilitate purification. 

X. Uses of The Present Invention 

Genetically Engineered Host Cells 

The present invention provides genetically engineered host cells comprising 
preferably non-revertable mutations or deletions in the naturally occurring gene 
encoding one or more of SP1, SP2, SP3, SP4 or SP5 such that the proteolytic 
activity is diminished or deleted altogether. The host cell may contain additional 
protease deletions, such as deletions of the mature subtilisin protease and/or mature 
neutral protease disclosed in United States Patent No. 5,264,366. 

In a preferred embodiment, the host cell is genetically engineered to produce 
a desired protein or polypeptide. In a preferred embodiment, the host cell is a 
Bacillus. In another preferred embodiment, the host cell is a Bacillus subtilis. 

In an alternative embodiment, a host cell is genetically engineered to produce 
a gram-positive SP1, SP2, SP3, SP4 or SP5. In a preferred embodiment, the host 
cell is grown under large scale fermentation conditions, the SP1, SP2, SP3, SP4 or 
SP5 is isolated and/or purified and used in cleaning compositions such as 
detergents. WO 95/10615 discloses detergent formulation. 

Polynucleotides 

A B. subtilis SP1, SP2, SP3, SP4 or SP5 polynucleotide, or any part thereof, 
provides the basis for detecting the presence of gram-positive microorganism 
polynucleotide homologs through hybridization techniques and PCR technology. 

Accordingly, one aspect of the present invention is to provide for nucleic acid 
hybridization and PCR probes which can be used to detect polynucleotide 
sequences, including genomic and cDNA sequences, encoding gram-positive SP1, 
SP2, SP3, SP4 or SP5 or portions thereof. 

The manner and method of carrying out the present invention may be more 
fully understood by those of skill in the art by reference to the following examples, 
which examples are not intended in any manner to limit the scope of the present 
invention or of the claims directed thereto. 

Example I 

Preparation of a Genomic Library 

The following example illustrates the preparation of a Bacillus genomic 
library. 

Genomic DNA from Bacillus cells is prepared as taught in Current Protocols 
In Molecular Biology vol. 1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987, 
chapter 2.4.1. Generally, Bacillus cells from a saturated liquid culture are lysed and 
the proteins removed by digestion with proteinase K. Cell wall debris, 
polysaccharides, and remaining proteins are removed by selective precipitation with 
CTAB, and high molecular weight genomic DNA is recovered from the resulting 
supernatant by isopropanol precipitation. If exceptionally clean genomic DNA is 
desired, an additional step of purifying the Bacillus genomic DNA on a cesium 
chloride gradient is added. 

After obtaining purified genomic DNA, the DNA is subjected to Sau3A 
digestion. Sau3A recognizes the 4 base pair site GATC and generates fragments 
compatible with several convenient phage lambda and cosmid vectors. The DNA is 
subjected to partial digestion to increase the chance of obtaining random fragments. 
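
The digestion step above can be sketched computationally. The minimal Python example below locates GATC sites and models a digest in which each site is cut independently with a given probability, so that a probability of 1.0 corresponds to a complete digest and lower values to a partial digest. The function names, the per-site probability model and the toy sequence are assumptions made for illustration; they are not part of the described protocol. In a random sequence of equal base composition a 4 base pair site is expected about once every 4^4 = 256 base pairs, which is why a complete digest yields fragments that are small relative to lambda and cosmid inserts and a partial digest is used instead.

```python
import random
import re

SITE = "GATC"  # Sau3A recognition sequence given in the text


def cut_positions(seq):
    """0-based index of every GATC; Sau3A cleaves immediately 5' of the G."""
    return [m.start() for m in re.finditer(SITE, seq.upper())]


def digest(seq, cut_probability=1.0, seed=None):
    """Cut each site independently with the given probability.

    cut_probability=1.0 models a complete digest; smaller values model a
    partial digest in which only some of the sites are cleaved.
    """
    rng = random.Random(seed)
    cuts = [p for p in cut_positions(seq) if rng.random() < cut_probability]
    bounds = [0] + cuts + [len(seq)]
    # Skip the empty fragment that would arise from a site at position 0.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


toy = "AAGATCCCGATCTT"  # hypothetical sequence with two GATC sites
print(digest(toy))                        # complete digest
print(digest(toy, cut_probability=0.5, seed=1))  # one possible partial digest
```

Every fragment other than the first begins with GATC, reflecting the cohesive ends that make the fragments compatible with suitably prepared vectors.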

The partially digested Bacillus genomic DNA is subjected to size fractionation 
on a 1% agarose gel prior to cloning into a vector. Alternatively, size fractionation on 
a sucrose gradient can be used. The genomic DNA obtained from the size 
fractionation step is purified away from the agarose and ligated into a cloning vector 
appropriate for use in a host cell and transformed into the host cell. 

Example II 

The following example describes the detection of gram-positive 
microorganism SP1. The same procedures can be used to detect SP2, SP3, SP4 or 
SP5. 

DNA derived from a gram-positive microorganism is prepared according to 
the methods disclosed in Current Protocols in Molecular Biology, Chap. 2 or 3. The 
nucleic acid is subjected to hybridization and/or PCR amplification with a probe or 
primer derived from SP1. A preferred probe comprises the nucleic acid section 
encoding conserved amino acid residues. 
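
When a probe is chosen from a region encoding conserved amino acid residues, it can help to confirm which residues a candidate nucleotide segment actually encodes. The sketch below is a minimal illustration that translates a DNA segment in its first reading frame using the standard genetic code. The function name is hypothetical, and the example input is the first 48 nucleotides of the first nucleotide sequence reproduced after the abstract, with the spacing introduced by scanning removed.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA segment in frame 1; unrecognized codons become 'X'."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


segment = "atgaaaaagctgataaccgcagacgacatcacagcgattgtctctgtg"
print(translate(segment))  # MKKLITADDITAIVSV
```

The printed residues agree with the translation shown beneath the same nucleotides in the sequence figure.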

The nucleic acid probe is labeled by combining 50 pmol of the nucleic acid 
and 250 mCi of [gamma-32P] adenosine triphosphate (Amersham, Chicago IL) and 
T4 polynucleotide kinase (DuPont NEN®, Boston MA). The labeled probe is purified 
with a Sephadex G-25 super fine resin column (Pharmacia). A portion containing 10^ 
counts per minute of each is used in a typical membrane-based hybridization 
analysis of nucleic acid sample of either genomic or cDNA origin. 

The DNA sample which has been subjected to restriction endonuclease 
digestion is fractionated on a 0.7 percent agarose gel and transferred to nylon 
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is 
carried out for 15 hours at 40 degrees C. To remove nonspecific signals, blots are 
sequentially washed at room temperature under increasingly stringent conditions up 
to 0.1 X saline sodium citrate and 0.5% sodium dodecyl sulfate. The blots are 
exposed to film for several hours, the film developed and hybridization patterns are 
compared visually to detect polynucleotide homologs of B. subtilis SP1. The 
homologs are subjected to confirmatory nucleic acid sequencing. Methods for nucleic 
acid sequencing are well known in the art. Conventional enzymatic methods employ 
DNA polymerase Klenow fragment, SEQUENASE® (US Biochemical Corp, 
Cleveland, OH) or Taq polymerase to extend DNA chains from an oligonucleotide 
primer annealed to the DNA template of interest. 
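
The wash series in the hybridization protocol above ends at 0.1 X saline sodium citrate (SSC). As a worked illustration of what that dilution means, the sketch below converts an SSC strength into molar concentrations. It assumes the conventional 20X SSC stock of 3 M sodium chloride and 0.3 M sodium citrate, which is not stated in the text, and the intermediate wash strengths are hypothetical because the text specifies only the final condition.

```python
# Conventional 20X SSC stock composition (assumed; not stated in the text).
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}  # mol/L


def ssc_molarity(strength_x):
    """Molar concentration of each SSC component at the given X strength."""
    return {name: conc * strength_x / 20.0 for name, conc in SSC_20X.items()}


# Hypothetical wash series of decreasing salt, that is, increasing stringency.
for strength in (2.0, 1.0, 0.5, 0.1):
    conc = ssc_molarity(strength)
    print(
        f"{strength}X SSC: {conc['NaCl'] * 1000:.1f} mM NaCl, "
        f"{conc['sodium citrate'] * 1000:.2f} mM sodium citrate"
    )
```

Lower salt destabilizes imperfectly matched duplexes, which is why decreasing SSC strength corresponds to increasing stringency.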

Various other examples and modifications of the foregoing description and 
examples will be apparent to a person skilled in the art after reading the disclosure 
without departing from the spirit and scope of the invention, and it is intended that all 
such examples or modifications be included within the scope of the appended 
claims. All publications and patents referenced herein are hereby incorporated in 
their entirety. 
CLAIMS 

We claim: 

1. An isolated polynucleotide encoding SP2 from a gram positive 
microorganism. 

2. The polynucleotide of Claim 1 wherein SP2 has the amino acid sequence shown 
in Figures 7A-7B. 

3. An isolated SP2 encoding nucleic acid having the nucleic acid sequence as 
shown in Figures 7A-7B. 

4. An isolated SP2 from a gram-positive microorganism. 

5. The isolated SP2 of Claim 4 having the amino acid sequence as shown in Figure 
7A-7B. 

6. An isolated polynucleotide encoding SP3 from a gram positive microorganism. 

7. The polynucleotide of Claim 6 wherein SP3 has the amino acid sequence shown 
in Figures 8A-8B. 

8. The isolated SP3 encoding nucleic acid having the sequence as shown in Figure 
8A-8B. 

9. An isolated SP3 from a gram-positive microorganism. 

10. The isolated SP3 of Claim 9 having the amino acid sequence as shown in 
Figures 8A-8B. 

11. A gram-positive microorganism having a mutation or deletion of part or all of one 
or more of the genes encoding serine proteases selected from the group 
consisting of SP1, SP2, SP3, SP4 and SP5, said mutation or deletion resulting in 
the inactivation of the CP1 proteolytic activity. 

12. The gram-positive microorganism according to Claim 11 that is a member of 
the family Bacillus. 

13. The microorganism according to Claim 12 wherein the member is selected from 
[...] B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B. lautus and 
Bacillus thuringiensis. 

14. The microorganism of Claim 11 wherein said microorganism is capable of 
expressing a heterologous protein. 

15. The microorganism of Claim 14 wherein said heterologous protein is selected 
from the group consisting of hormone, enzyme, growth factor and cytokine. 

16. The host cell of Claim 15 wherein said heterologous protein is an enzyme. 

17. The host cell of Claim 16 wherein said enzyme is selected from the group 
consisting of proteases, carbohydrases, and lipases; isomerases such as 
racemases, epimerases, tautomerases, or mutases; transferases, kinases and 
phosphatases. 

18. A cleaning composition comprising a serine protease selected from the group 
consisting of SP1, SP2, SP3, SP4 and SP5. 

21. An expression vector comprising nucleic acid encoding a serine protease selected 
from the group consisting of SP1, SP2, SP3, SP4 or SP5. 

22. A host cell comprising an expression vector according to Claim 21. 
ABSTRACT 

The present invention relates to the identification of novel serine proteases in 
Gram-positive microorganisms. The present invention provides the nucleic acid and 
amino acid sequences for the Bacillus subtilis serine proteases SP1, SP2, SP3, SP4 
and SP5. The present invention also provides host cells having a mutation or 
deletion of part or all of the gene encoding SP1, SP2, SP3, SP4 and SP5. The 
present invention also provides host cells further comprising nucleic acid encoding 
desired heterologous proteins such as enzymes. The present invention also 
provides a cleaning composition comprising a serine protease of the present 
invention. 
10 30 
atgaaaaagctgataaccgcagacgacatcacagcgattgtctc tg tg 
MKKLITADDITAIVSV 

50 70 90 

accgatcctcaatacgccccagacggtacccgtgccgcatatgtaaaa 

TDPQYAPDGTRAAYVK 

110 130 
tcacaagtaaatcaagagaaagattcgtatacatcaaatatatggatc 
S Q V N 0 E K D S Y T S u I W I 

150 170 190 

catgaaacgaaaacgggaggatctgttccttggacacatggagaaaag 
YETKTGGSVPWTHGEK 

210 230 
cgaagcaccgacccaagatggtctccggacgggcgcacgcttgccttt 
RSTDPRWSPDGRTLAF 

250 270 2 

atttctgatcgagaaggcgatgcggcacagctttatatcatgagcact 
I SDREGDAAQLYIMST 

90 310 330 

gaaggcggagaagcaagaaaactgactgatatcccatatggcgtgtca 

EGGEARKLTDIPYGVS 

350 370 
aagccgctatggtccccggacggtgaatcgattctggtcactatcagt 
KPLVJSPDGESILVTIS 

390 410 430 

r tgggagagggggaaagcaccgatgaccgagaaaaaacagagcaggac 
LGEGESIDDREKTEQD 

450 470 
age ta tgaacc tgt tgaagtgcaaggcc tctcc tacaaacgggacggc 
SYEPVEVQGLSYKRDG 

490 510 5 

aaagggc tgacgagaggt gcgtatgcccagc t tgtgc t tgtcagcgta 
K G L T R _G A Y A Q L V L V S V 

30 550 570 

aagcccggtgaga tgaaagagc tgacaagtcacaaagc tga tcacggt 
KSGEK KELTS HKADHG 

5 9 0 610 
garcc tgctr c t cctcctgacggcaaatggcttgttt tctcagctaat 
DPAFSPDGKWLVFSAN 

630 650 670 

tcaac cgaaacagarga tgccagcaagccgca tgatg 1 1 taca taatg 
LTETDDASKPHDVYIM 

690 710 



2 



tcactggagtctggagatctcaagcaggctacacctcatcgcggctca 
S L E S G D L K Q V T ? H R G S 



730 750 7 

ttcggatcaagc tcatttt caccagacggaaggtatcttgct ttgctt 

LALL 



FGSSSFSPDGR 



7 0 



790 810 
ggaaa tgaaaaggaat a taagaatgctacgctctcaaaggcgtggctc 
^^NEKEYKNATLSKAWL 



830 



850 



YDIEQGRLTCLTEML 



D 



870 



890 910 
gt tea tttagcggatgcgctgattggagau teat tgatcggtggtgct 
VKLADALIGDSLIGGA 

930 950 
gaacagcgcccgatttggacaaaggacagccaagggttttatgtcatc 
EQRPIWTKDSQGFYVI 

570 990 10 

ggcacagatcaaggcagtacgggeatctartatatttegattgaaggc 
GTDQGSTGIYYisi-EG 

1030 



10 
e 



1050 



LVYPIRLEKEYINSF 

1070 1090 
ctttcacctgatgaacagcac tttattgccagtgtgacaaagccggac 
LSPDEQHFIASVTKPD 

1110 1130 1150 

agaccgagtgagctttacagtatcccgct-gaacaggaagagaaacag 
RPSELYS I PLGQEEKQ. 

1170 1190 
ctgactggcgcgaatgacaagtttgtcagggagcatacgatatcaata 
LTGANDKFVREKTISI 

1210 1230 12 

cctgaagagattcaatatgctacagaagacggcgtgatggtgaacggc 
P ^ ^ 1 Q A T E D G V M V N G 

50 1270 1290 

iggctgatgaggcctgcacaaatggaaggtgagacaacatatccact t 



VJ L M R P A Q M E G E 



T Y P L 



1310 1330 

attcttaacatacacggcggrccgcatatgatgtaeggacatacatat 
II^NIHGGPHMMYGKTY 

13S0 1370 1390 

t t teat gagtttcaggtgetggcggcgaaaggatacgcggccgtt tat 
^ H E F Q V L. A A K G Y A V V Y 

1410 1430 






atcaatccgagaggaagccacggc tacgggcaggaat t: tgtgaa tgcg 
INPRGSHGYGQHFVNA 

1450 1470 14 

gtcagaggagactatgggggaaaggat catgacgatgtgatgcaggct: 
VRGDYGGKDYDDVMQA 

90 1510 1530 

gtgga tgaggc tatcaaacgagatccgcatattgatcctaagcggc tc 
VDEAIKRDPHIDPKRL 

1550 1570 
ggtgtcacgggcggaagc tacggaggttttatgaccaac tggatcgtc 
G V T G G S Y G G F M T N W I V 

1590 1610 1630 

gggcagacgaaccgctttaaagc tgccgttacccagcgc tcgatatca 
GQTNRFKAAVTQRSI S 

1650 1670 
aattggatcagctttcacggcgtcagtgatatcggctatttc tttaca 
NWISFHGVSDIGYFFT 

1690 1710 17 

gac tggcagcttgagcatgaca tgt t tgaggacacagaaaagctctgg 
DWQLEHDMFEDTEKLW 

30 1750 1770 

gaccggtc tcctttaaaatacgcagcaaacgtggagacaccgcttttg 
DRSPLKYAANVETPLL 

1790 1810 
atactgcatggcgagcgggatgaccga tgcccgatcgagcaggcggag 
ILHGERDDPCPIEQAE 

1830 1850 1870 

cage cgr t tatcgc tctgaaaaaaacgggcaaggaaaccaagc ttgtc 
QLFIALKKMGKETKLV 

1890 1910 
eg 1 1 1 tccgaatgcatcgcacaa r c tatcacgcaccggacacccaaga 
RFPKASKNLSRTGHPR 

1930 1S50 19 

cagcggatcaagcgcc tgaat ta Laccagc ticatggli t tgatcaacat 
QR IKRLNYISSWFDQH 

70 
czc 
dap2_yeast (upper sequence) aligned with YUXL (lower sequence) 

170 



Initn: 165 Opt: 204 z-score: 227.4 E() 
20.3% identity in 646 aa overlap 

180 



3e-0 



150 200 210 220 

WRHSTFGSYFVYDKSSSSFEEIGNEVALAIWSPNSNDIAYVQDN-NIYIYSAISKKTIRA 

: : I : : : I i 1 : : : I | i : | 

MKKLITADDITAIVSVTDPQYAPDGTRAAYVKSQVNQEKDSYTSNIWIYE 
10 20 30 40 50 

230 240 250 260 270 280 

VTNDGSSFLFNGKPDWVYEEEVFEDDKAAWWSPTGDYLAFLKIDESEVGEFIIPYYVODE 
• • " I 1:: I: I : III | IN:: |:::::: | :: , 

-P-WTHGEKRSTDPR WS PDGRTLAFISDREGDAAQL YIMSTE 

60 70 80 90 



TKTGGSV- 



290 300 310 320 330 

KDIYPEMRSIKYPKSG— TPNPHAELWVYSMKDGTSFHPRISGNKKDG— SLLITEVTW 
: : : I i I : I : : : I : I : : | | : | : : : ] : : ... 

GGEARKLTDIPYGVSKPLWSPDGESILVTISLGEGESIDDR-EKTEQDSYEPVEVOGLSY 
100 110 120 130 140 150 



340 



350 



360 370 380 390 

VGNGNVLVKTTDRSSDILTVFLIDTIAKTSNWRNE SSNGGWWEITHNTLFI PANE 

: I • I : : : : : : : I : il : : : I : I I : : | : • - 

KRDGKGLTRGAYAQLVLVSVKSGEMKELTSHKADHGDPAFSPDGKWLVFSAN---LTETD 
160 170 180 190 200 



210 



400 



410 



420 430 440 

TFDRPHNGYVDILPIGGYN HLAYFENSNSS — H YKTLTEGKWEVVNGPLA F 

- " : I : 1 i : ! : I :| : j : l : | : : | , . , . 

DASKPHDVYIMSLESGDLKQVTPHRGSFGSSSFSPDGRYLALLGNEKEYKNATLSKAWLY 
220 230 240 250 



260 



270 



450 



460 



470 480 490 4 99 

DSMENRLYFISTRKSSTERHVYYID-LRSPNEIIEVTDTSEDGVYDVSFSSGRRFGL--L 



: : II 



I 



I : 



DIEQGRLTCLTEMLDVHLADALIGDSLIGGAEQRPIWTKDSQGFYVIGTDQGST-GIYYI 
280 290 300 310 320 



330 



500 



510 



520 530 540 550 

TYKGPKVPYQKIVDFHSRKAEKCDKGNVLGKSLYHLEKNEVLTKILEDYAVPR-KSFREL 
: ^ i i : : : : : : I : : : : : : : ■[ : | . . j , . . , 

SIEGLVYPIRLEKEYINSFSLSPDEQHFIASVTKPDRPSEL- YSIPLGOEEKOL 

340 350 360 370 380 

560 570 580 590 600 

NLGKDEFGKD ILVNSYEILPNDFDETLSDHYPVFFFAYGGPNSQ 

' • ■ I • f - • : : M : : : I : : : : M : : : : | | ! : : 

TGANDKFVREHTISIPEEIQYATEDGVMVNGWLMRPAQMEGETT--YPLILN1HGGPH-M 
390 400 410 420 430 440 



610 620 630 640 650 660 

dap2_yeast QWKTFSVGFNEWASQLNAIWVVDGRGTGFKGQDFRSLVRDRLGDYEARDQISAAS-L 

: : 1 : | : | : | : : I I I : : I I : I I : I : I I I : I : : I : : 
YUXL MYGHTYFHEF-QVLAAKGYA-WYINPRGSHGYGQEFVNAVRGDYGGKDYDDVMQAVDEA 

450 460 ' . 470 480 ' 490 500 

670 680 690 700 710 720 

dap2_yeast YGSLTFVDPQKISLFGWSYGGYLTLKTLEKDGGRHFKYGMSVAPVTDWRFYDSVYTERYM 

: I I : : : : : I I I I I : : 1 : : : : I I : : : : : : 1 : : I I : 

YUXL IKRDPHIDPKJILGVTGGSYGGFMTNWIVGQTN — RFKAAVTQRSISNWISFHGVSDIGYF 

510 520 530 540 550 

730 740 750 x^/ 7 60 770 

dap2_yeast HTP-QENFDGYVES-SVHNVTALAQANR FLLMHGTGDDNVHFQNSLKFLDLLDLNG 

I I : I ::::::: I I : I : : I I II ::::::: | | 

YUXL FTDWQLEHDMFEDTEKLWDRSPLKYAANVETPLLILHGERDDRCPIEQAEQLFIALKKMG 
560 570 . 580 590 600 610 


780 1% 800 810 

daD2_veast VENYDVHVFPDSDHS IRYHNANVIVFDKLLDWAPCRAFDGQFVK 

1 : i : II : : : i : : 
YUXL KETKLVR-FPNASHNLSRTGHPRQRIKRLNYISSWFDQHL 
620 630 640 650 






SCORES Initl: 45 Initn : 45 Opt: 178 z-score: 198.3 E(): 1.26 

05 

Smith-Waterman score: 215; 23.2% identity in 254 aa overlap 

380 390 400 410 420 430 439 

yuxl .bsupep QEEKQLTGANDKEVREHTISIPEEIQYATEDGVMVNGWLMRPAQMEGETTYPLILNIHGG 

: I : I : I I : I I : I : I : I : : I I 

'^IVEKRRFPSPSQHVRLYTICYLSNGLRVKGLLAEPAE-PGQ— YDGFLYLRGG 
10 20 30 40 50 

, ^ 450 460 470 480 490 

yuxl . bsupep PHMMYGHTYFHEFQVLAAKGYAWYINPRGSHG-YGQEFVNAVRGDYGGKDYDDVMQAVD 

YTMA IKSV-GMVRPGRI IQFASQGFWFAPFYicNQiGEGNE l!)FAGEiREl!)AFSAF- 

60 70 80 -90 100 

500 510 520 530 540 550 

yuxl . bsupep EAIKRDPHIDPKRLGVTGGSYGGFMTNWIVGQTNRFKAAVTQRSISNWISFHGVSDIGYF 
• • = = i : : I : : I I I I : I I : : : : : | ■ • i i i i i • 

YTMA RLLQQHPNVKKDRIHIFGFSRGGIM GMLTAIEMGGQAASFVSW— GGVSDMILT 

110 120 f^^^ 130 140 150 

, ^ 570 580 590 600 

yuxl .bsupep FTDWQLEHDMFEDT EKLWDRSPLKYAANVETPLLILHGERDDRCPIEQAE 

YTMA YEERQDLRRMMKRVIGGTPKKVPEEYQW-RTPFDQVNKIQAPviLIHGEKDQNVsioH^ 
160 170 180 190 200 210 

610 620 630 640 650 P 

yuxl . bsupep QLFIALKKMGKETKLVRFPNASHNLSRTGHPRQRIKRLNYISSWFDQHL 

I I I : : I : : : : : | : | ; | . 

YTMA LLEEKLKQLHKPVETWYYSTFTHYFP PKENRRI VRQLTQWMKNR 

220 230 ^^,<, 240 250 



SCORES initl: 58 Initn: 84 Opt: 153 z-score : 17 1 . 4 E ( ) : 0 . 000 

Smith-Waterman score: 153; 23.9% identity in 243 aa overlap 

-^20 430 440 450 460 

yuxl . bsupep PEEIQYATEDGVMVNGWLMRPAQMEGETTYPLILNIHGGPHMMYGHTYFHEFQVLAAKGY 

^^^^ M^QIENQTVSGIPFLHIVKEENRHRAVPLviFIHGFTSAKE-HN-LHIAYLLAEKGF 

10 20 30 40 50 

, ^ ^'70 480 ' 490 500 510 

yuxl . bsupep AWYINPRGSHGYGQEFVNAVRGDYGGKDYDDVMQAVDEA IKRDPHIDPKRLGV 

= ' I : : I : I : : : : : : I : : | | : : : : | | , ■ . , . 

YITV RAVL— PEALH-HGERGEEMAVEELAGHFWDIVLNEIEEIGVLKNHFEKEGLIDGGRIGL 

^° "70 80 90 100 110 

, ^ ^20 530 540 550 560 570 

yuxl . bsupep TGGSYGGFMTNWIVGQTNRFKAAVTQRSISNWISFHGVSDIGYFFTDWQLEHDMFED-TE 
= ' ' = I ' = I : : I I : 1 : : | : : : : : | : : : : | | : | : : : 

^^TV AGTSMGGITTLGALTAYDWIKAGVSLMGSPNYVELFQ-QQIDHI-QSQGIEIDVPEEKVQ 
-^120 130 140 150 160 170 

580 590 600 610 620 

yuxl . bsupep KLWDRSPLKYAANV ETPLLILHGERDDRCPIEQAEQLFIALKKMGKET KLV 

: I i I : : : I i I : I I : I I ::::::: | : : | 

QLMKRLELRDLSLOPEKLQQRPLLFWHGAKDKWPYAPTRKFYDTIKSHYSEQPERLQFI 
180 190 200^^^^ 210 • 220 230 

630 . 640 650 

yuxl . bsupep RFPNASHNLSRTGHPRQRIKRLNYISSWFDQHL 

I I : I : : II : I : I I I : : | 

YITV GDENADHKV PRAAV — LKTIE-WFETYL 

'f^h^iS 240 250 



SCORES initl: 67 Initn: 67 Opt: 117 z-score: 131.5 E(): 0, 

Smith-Waterman score: 117; 21.6% identity in 232 aa overlap 

^00 410 420 430 440 



yuxl . bsupep TGANDKFVREHTISIPEEIQYATEDGVMVNGWLMRPAQMEGETTYPLILNIHGGP-HMMY 

YQKD "Jf^DNGHDVFESFEQMEKTAFVIPSAYGYDiKGYHVAPHDipNTlilCHGVTMNVL^ 
^ 60 70 80 90 

450 460 470 



yuxl . bsupep GHTYFHEFQVLAAKGYAWYINPRGiHGYGQEF^NAVRGDYGGKD^ 

VQKD SLKYMHLFLDL---GWNVLIYDHR-RHG^^ GGKTTsiGFYEKiiLNKWSLLKNKT 

^■^^ 120 130 

yuxl. bsupep phidpLlgvtggsyggfmtnwiv^^^ 

YQKD ,^S^^--^^^I"^|SMGAVTALLYAGAHCSDGADFY^ 

ie?6^r//v£i 170 180 190 200 

yuxl.bsupep''?pWQLEH--DMFEDTE---KLWDRS^ 

VQKD ^2"„P^^P?^D^^LKLRGGYRAREVSpiAVIDKiEKiviFiisK^ 

230 240 2t0A^p 260 

1 ^ ^20 630 640 650 

yuxl . bsupep KKMGKETKLVRFPNASHNLSRTGHPRQRIKRLNYISSWFDQKT 
I I I : : : I : I :J I 

^QKD K^-'^GP'^ALYIA-ENGEHAMSYTKNRHTYRKTVQEFLDNMNDST" 

270 ^80 290 300 






SCORES Initl: 66 Initn: 90 Opt: 114 z-score: 128.0 E(): 0. 

1 

Smith-Waterman score: 152; 25.1% identity in 303 aa overlap 

330 340 350 360 370 379 

yuxl bsupep GTDQGSTGIYYISIEGLVyPIRLEKEYINSFSLSPDE-QHFIASVTKPDRPSELYSIPLG 

I : I I : I : i I I I : : : : I 

CAH MQLFDLPLDQLQTYKPEKTAPKDFSEFWKLSLE 

10 20 30 



380 390 400 410 420 430 

yuxl .bsupep QEEKQLTGANDKFVREHTISi P-EEIQYATEDGVMVNGWLMRPAQMEGETTYPLILNIHG 
: I : : : ! : : : : : I : : : : : I I I : I I : | | : : | I 
CAH ELAKVQAEPDLQPVDYPADGVKVYRLTYKSFGNARITGWYAVPDK-EGP — HPAIVKYHG 

40 50 60 70 80 90 



440 450 460 470 480 

yuxl . bsupep GPHMMYGHTYFHEFQVLAAKGYAV VYINPRGSHGYGQEFVNAVRGD- 

: I : : M : 1 : I I ! : : I : I : 1 I : 1 : : ! 

CAH YNASYDGE — IHEMVNWALHGYATFGMLVRGQQSSEDTSISPHG-HALGWMTKGILDKDT 

100 110 120 130 140 



490 500 510 520 530 540 

yuxl . bsupep — YGGKDYDDVMQAVDEAIKRDPHI DPKRLGVTGGSYGGFMTNWIVGQTNRFKAAVTQRS 

11 I I : : : I : I : I : : : I I : ! 1 I ! i i 1 I : I : : : : I I 1 I : : 
CAH YYYRGV-YLDAVRAL-EVISSFDSVDETRIGVTGGSQGGGLTIAAAALSDIPKAAVADYP 
150 160 170 -^^^^B/^ 

550 560 570 580 590 

vuxl . bsupep -ISNWISFHGVS DIGYFFTDWQLEHDMFEDTEKLWDRSPLKYAANVETPLLILH 

: 1 I : _ I : : ! : t i : : : I : : : I 1 : : 1 : 1 : 

CAH yLSNFERAIDVALEQPVLElNSFFRRNGSPETEVQAMKTLSYFDIMNLADRVKVPVLMSI 
210 220 230 240 250 260 



600 610 620 630 640 650 

yuxl . bsuDeD GERDDRCPIEQAEQLFI ALKKH- -GKETKLVRFPNASHNLSRTGHPRQRIKRLNYISSV7F 

i 1 I : I ! : : : 111:!: X 

CAH GLIDKVTP PSTVF.A.^-.YNHLETK?CELKVYRYFGHEYI PAFQTEKLAFFKQHLKG 

270 2S0 290 foO^IS 310 






10 30 
tigattgtagagaaaagaagati: tccgtcgccaagccagcatgtgcgt 
IVEKRRFPSPSQHVR 



70 90 
t cgtatacgatctgctatczgtcaaatiaaattacaggiitiaaggggc tt 
^VTICYLSNGLK~'VKGL 

110 130 
ctggctgagccggcggaaccgggacaatacgacggacttttatatttg 

laepaepgqydg'flyl 

150 170 190 

cgcggcgggattaaaagcgtgggcatggttcggccgggccggattatc 

^^ggiksvgmvr'^p'grii 



210 230 
cagrttgcatcccaagggtttgtggtgtttgctcctttttacagaggc 
OFASQGFVVFAPFYRG 

250 270 2 

aatcaaggaggagaaggcaatgaggattttgccggagaagacagggag 
NQGGEGNEDFAGEDRE 

90 310 330 

aatgcattttctgctttitcgcctgcttcagcagcacccaaatgtcaag 




370 

itttcccgcggcggaat tatggga 
GFSRGGIMG 



390 410 430 

atigctcac tgcgatcgaaa tgggcgggcagg eager teat ttgtt tec 
^- ^ '"^ A I E M G G Q A A S F V £ 



450 470 
tgcggaggcgtcagtgatatgattcr tacatacgaggagcggcaggat 
''"'^'""'^'^^''LTYEERQD 



G V SDK 



490 510 5 

r tgeggcgaatgargaaaagagteatcggcggaacaecgaaaaaggtg 
1 R R M M R V I G G T P K K V 



550 570 
ec tgaggaa taccaatggaggacacegtt tgaccaagcaaacaaaatt 

0 V N K I 



Q W R T P F D 



590 610 
cacgc tcccgcgc tgttaa tee a cggagaaaaagaccaaaatgtt teg 
Q A P V L L I H G E K D Q N V S 

530 650 670 

an tea gcactce tat ttattagaagagaagccaaaacaactgcataag 
Q H S \' L L E E K L K Q L H K 



6 90 

-? 9 '1 
ccggtiggaaacacggcactacagtacat 
P V E T W V Y S T F T K Y F P P 

730 750 7 

aaagaaaaccggcgtatcgtgcggcagctcacacaatggatgaaaaac 
KENRRIVRQLTQWMKN 

70 

cgc 

R 




50 70 90 

attgtaaaggaagagaacaggcaccgcgc tgt tcctc tcgtgatcct t 
IVKEENRHRAVPLVIF 

110 130 
atacatggttttacaagcgcgaaggaacacaaccttcatattgcttat 
IHGFTSAKEHNLKIAY 

150 170 190 

ctgcttgcggagaagggttttagagccgttctgccggaggctttgcac 
LLAEKGFRAVLPEALH 

210 230 
catggggaacggggagaagaaatggctgttgaagagctggcggggcat 
HGERGEEMAVEELAGH 

250 270 2 

ttttgggatatcgtcctcaacgagattgaagagatcggcgtacttaaa 
FWDIVLNEIEEIGVLK 

50 - 310 330 

aaccattttgaaaaagagggcctgatagacggcggccgcatcggtc tc 
NKFEKEGLIDGGRIGL 



350 



370 



gcaggcacgtcaatgggcggcatcacaacgcttggcgctttgactaca 
A GTS MGGITTLGALTA 

3 9 0 410 : , 4 3 0 

Latgattggataaaagccggcgtcagcctgataagaagcccgaattac 
Y D W T 




450 

gtggagc tgtttcagc 
^ ^ F Q Q Q 

490 510 5 

gaaatcgatgtigccggaagagaaggtacagcagctigatgaaacgtctc 
Eli:)VP^E£KVQQLMKRL 

30 550 570 . 

gagt tgcgggatc tcagcc ttcagccggagaaactgcaacagcgcccg 
HLRDLSLQPEKLQQRP 

590 610 
c tut t tat tt tggcacggcgcaaaagataaagttgtgccttacgcgccg 
^ L E v: H G A K D K V V P y A P 

^30 650 670 

acccggaaat t matgacacgat taaa tcccattacagcgagcagccg 
T R K r Y D T I K S H Y S E Q P 

6 9 



710 



gaacgcc tgcaattta tcggagatg aaaacgctga cc ataaag tec eg 
ERLQFIGDENADHKVp" 

730 750 
cgggcagctgtgttaaaaacgattgaatggtttgaaacgtactta 
RAAVLKTIEW FETYL 



10 30 

ttgaagaaaatcc tt ctggccat cggcgcgcccgcaacagctgtcatc 
L K ^'^-K ILLAIGALVTAVI 

5 0.^ 70 90 

gcaatcggaattgt t tc ttcacatatgatcctattcaccaagaaaaaa 
AIGIVFSHMILFIKKK 

110 130 
acggatgaagaca ttatcaaaagagagacagacaacggacatgatgtg 
TDEDI I KRETDNGHDV' 

150 170 190 

tt tga a teat ttga a c aa a zggagaaaac c get tttgtgatac cc tec 

fesf'eqmektafvi PS 

210 230 
gcttacgggtacgacataaaaggataecatgtcgcaccgcatgacaca 
AYGYDIKGYHVAPHDT 

250 270 2 

ccaaatac ca teat cat ctgecacggggtgacgatgaatgtactgaat 
PNTII ICKGVTMNVLN 

90 310 330 

tc tct taagtata tgcat tza tttctagatctcggctggaatgtgc tc 

SLKYMHLFLDLGWNVL 

350 370 
att tatgaccatcgccggcatggccaaagcggcggaaagacgaccaac 
I Y D H R R H G Q S G G K T T s"^ 

390 410 430 

tacgggttttacgaaaaggatgatctcaataaggttgteagcttgctc 
YGFYEKDDLNKVVSLL 

450 470 
aaaaaeaaaacaaa tea tcgcggat tgatcggaat tcatggtgagtcg 
K N K T N H R G L I G I H G E S 

490 510 5 

atgggggcegtgaccgcce tgc t t tatgctggtgcaeactgcagcga t 
M G A V T A L L Y A G A H C S D 

30 550 570 

ggcgctgat t t t tata ttgcegattgtccgt tcgcatgtt ttgatgaa 
GADFYIADCPFACFDE 

590 610 
cage t tgcctatcggctgagagcggaatacaggctcccgtcttggccc 
Q L A Y R L R A E Y R L P S W P 

630 650 670 

etgct tec ta tcgccgac t cc 1 1 tt: tgaagetgaggggagge tategc 
L L P I A D r F L K L R G G Y R^ 



690 710 




gcacgtgaagtatctccgcc tgc tgtca t tgataaaaccgaaaagccg 
A R E V S P L A V I D K I E K P 

730 750 7 

gtcctctttattcacagtaaggatgatgactacattcctgttitcttca 
VLFIHSKDDDYIPVSS 

70 790 810 

accgagcggctttatgaaaagaaacgcggtccgaaagcgctgtacatt 
ter"'lyekkrg PKALYI 

830 850 
gccgag^acggtgaacacgccatgtcatataccaaaaatcggcatacg 
AENGEHAMSYTKNRH. T 

870 890 910 

taccgaaaaacagtgcaggagttt ttagacaacatgaatgattcaaca 

yr~'ktv"'qefi.dnmndst 



gaa 
E 






